Some packages we will use:

library(tidyverse)
library(leaflet)
library(sf)
library(terra)
library(tidyterra)
library(ggmap)
library(DHARMa)

Koalas

We will look at some records of Koalas I got from the Atlas of Living Australia. I picked a somewhat arbitrary radius including Armidale.

koalas <- read_csv('maps_workshop_material/records-2023-06-20.csv')
## Rows: 650 Columns: 10
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (5): dataResourceUid, geodeticDatum, verbatimCoordinateSystem, species,...
## dbl  (4): decimalLatitude, decimalLongitude, verbatimLatitude, verbatimLongi...
## dttm (1): eventDate
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
ggplot(koalas,aes(x=decimalLongitude,y=decimalLatitude))+
  geom_point()

Just points are a little boring/unhelpful. One quick/easy option is leaflet:

m <- leaflet() %>% setView(lng = mean(koalas$decimalLongitude),
                           lat = mean(koalas$decimalLatitude),
                           zoom=12)
m %>% addTiles() %>% addMarkers(data = koalas,
                                ~decimalLongitude,
                                ~decimalLatitude,
                                #clusterOptions = markerClusterOptions()
                                )

Note we can do the cluster thing if we want – it is quite helpful.

Interactive maps are cool but we maybe want something more static for papers etc. ggmap can be a bit annoying but is one way:

a_box <- c(left = min(koalas$decimalLongitude) - .01,
          right = max(koalas$decimalLongitude) + .01,
          bottom = min(koalas$decimalLatitude) - .01,
          top = max(koalas$decimalLatitude) + .01)
aMap <- get_stamenmap(a_box,zoom=11,maptype = "terrain")
## ℹ Map tiles by Stamen Design, under CC BY 3.0. Data by OpenStreetMap, under ODbL.
ggmap(aMap) + 
  geom_point(data = koalas, 
             aes(x=decimalLongitude, 
                 y=decimalLatitude))

#because of the API its a bit clunky

Perhaps we would like some science. When working with maps there are two main data types you want to know: raster data and vector data. Vector data are shapes – lines, polygons, etc., whereas rasters are things that have pixels. You deal with vectors using sf and rasters using terra (previously you used sp and raster)

I’ve given an example of each: some land cover data in a raster, and some ABS boundaries which are vector polygons.

landCover <- rast('./maps_workshop_material/ga_ls_landcover_class_cyear_2_1-0-0_au_x18y-35_2020-01-01_level4_rgb.tif')
landCover
## class       : SpatRaster 
## dimensions  : 4000, 4000, 3  (nrow, ncol, nlyr)
## resolution  : 25, 25  (x, y)
## extent      : 1800000, 1900000, -3500000, -3400000  (xmin, xmax, ymin, ymax)
## coord. ref. : GDA94 / Australian Albers (EPSG:3577) 
## source      : ga_ls_landcover_class_cyear_2_1-0-0_au_x18y-35_2020-01-01_level4_rgb.tif 
## colors RGB  : 1, 2, 3 
## names       : Red, Green, Blue
abs_poly <- st_read('./maps_workshop_material/SA1_2021_subset.shp')
## Reading layer `SA1_2021_subset' from data source 
##   `C:\Users\rcope4\OneDrive - University of New England\Documents\teaching\misc workshops etc\maps\maps_workshop_material\SA1_2021_subset.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 512 features and 17 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: 148.6762 ymin: -31.85823 xmax: 152.6314 ymax: -28.24899
## Geodetic CRS:  GCS_GDA2020
abs_poly
## Simple feature collection with 512 features and 17 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: 148.6762 ymin: -31.85823 xmax: 152.6314 ymax: -28.24899
## Geodetic CRS:  GCS_GDA2020
## First 10 features:
##     SA1_CODE21 CHG_FLAG21 CHG_LBL21 SA2_CODE21 SA2_NAME21 SA3_CODE21 SA3_NAME21
## 1  11001118601          0 No change  110011186   Armidale      11001   Armidale
## 2  11001118602          0 No change  110011186   Armidale      11001   Armidale
## 3  11001118603          0 No change  110011186   Armidale      11001   Armidale
## 4  11001118605          0 No change  110011186   Armidale      11001   Armidale
## 5  11001118606          0 No change  110011186   Armidale      11001   Armidale
## 6  11001118607          0 No change  110011186   Armidale      11001   Armidale
## 7  11001118608          0 No change  110011186   Armidale      11001   Armidale
## 8  11001118609          0 No change  110011186   Armidale      11001   Armidale
## 9  11001118611          0 No change  110011186   Armidale      11001   Armidale
## 10 11001118612          0 No change  110011186   Armidale      11001   Armidale
##    SA4_CODE21                 SA4_NAME21 GCC_CODE21  GCC_NAME21 STE_CODE21
## 1         110 New England and North West      1RNSW Rest of NSW          1
## 2         110 New England and North West      1RNSW Rest of NSW          1
## 3         110 New England and North West      1RNSW Rest of NSW          1
## 4         110 New England and North West      1RNSW Rest of NSW          1
## 5         110 New England and North West      1RNSW Rest of NSW          1
## 6         110 New England and North West      1RNSW Rest of NSW          1
## 7         110 New England and North West      1RNSW Rest of NSW          1
## 8         110 New England and North West      1RNSW Rest of NSW          1
## 9         110 New England and North West      1RNSW Rest of NSW          1
## 10        110 New England and North West      1RNSW Rest of NSW          1
##         STE_NAME21 AUS_CODE21 AUS_NAME21 AREASQKM21
## 1  New South Wales        AUS  Australia     1.8617
## 2  New South Wales        AUS  Australia    34.3521
## 3  New South Wales        AUS  Australia    39.1836
## 4  New South Wales        AUS  Australia    46.2758
## 5  New South Wales        AUS  Australia    39.5515
## 6  New South Wales        AUS  Australia     0.6526
## 7  New South Wales        AUS  Australia     0.2007
## 8  New South Wales        AUS  Australia     0.2262
## 9  New South Wales        AUS  Australia    34.9480
## 10 New South Wales        AUS  Australia     0.2459
##                                                   LOCI_URI21
## 1  http://linked.data.gov.au/dataset/asgsed3/SA1/11001118601
## 2  http://linked.data.gov.au/dataset/asgsed3/SA1/11001118602
## 3  http://linked.data.gov.au/dataset/asgsed3/SA1/11001118603
## 4  http://linked.data.gov.au/dataset/asgsed3/SA1/11001118605
## 5  http://linked.data.gov.au/dataset/asgsed3/SA1/11001118606
## 6  http://linked.data.gov.au/dataset/asgsed3/SA1/11001118607
## 7  http://linked.data.gov.au/dataset/asgsed3/SA1/11001118608
## 8  http://linked.data.gov.au/dataset/asgsed3/SA1/11001118609
## 9  http://linked.data.gov.au/dataset/asgsed3/SA1/11001118611
## 10 http://linked.data.gov.au/dataset/asgsed3/SA1/11001118612
##                          geometry
## 1  POLYGON ((151.6426 -30.5008...
## 2  POLYGON ((151.7053 -30.5413...
## 3  POLYGON ((151.6969 -30.4698...
## 4  POLYGON ((151.5733 -30.4850...
## 5  POLYGON ((151.5869 -30.5545...
## 6  POLYGON ((151.6433 -30.4989...
## 7  POLYGON ((151.6744 -30.4954...
## 8  POLYGON ((151.6707 -30.4977...
## 9  POLYGON ((151.7049 -30.5277...
## 10 POLYGON ((151.6876 -30.5070...

Note the ABS polygons are a subset – the data I obtained from the website was every polygon in Australia and quite large.

We can plot them:

ggplot()+geom_spatraster_rgb(data = landCover)
## SpatRaster resampled to ncells = 501264

Note we needed tidyterra for the geom above

ggplot(abs_poly)+geom_sf()

The raster is a bit annoying to work with / extract information from (though there is a lot of information here – 4000x4000 25m land-cover measurements). The sf is a lot easier to deal with, its kinda just a data frame but with one column being the polygons. This means that you can use the standard R tools you use for data wrangling.

The polygons themselves are a little boring; The main reason I’d use ABS polygons is to use ABS data (you have some)

abs_data <- read_csv('maps_workshop_material/2021_SA1_NSW_subset.csv')
## Rows: 512 Columns: 281
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (281): SA1_CODE_2021, F_GrDi_GrCer_GrDi_Mng, F_GrDi_GrCer_GrDi_Pro, F_Gr...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
abs_data <- abs_data %>% select(SA1_CODE_2021,
                                PhDs = P_PoDe_DoctDe_Tot,
                   pop = P_Tot_Tot)

To combine these / make the map, we need to ensure they share a column that has the same values to match on (particularly here, the same type of values). So we do:

abs_poly %>% filter(SA2_NAME21 == 'Armidale') %>%
  mutate(SA1_CODE_2021 = as.numeric(SA1_CODE21)) %>%
  left_join(abs_data) %>%
  ggplot(aes(fill = PhDs/pop)) + geom_sf() + 
  geom_point(data = koalas,
             inherit.aes = FALSE,
             col='red',
             aes(x=decimalLongitude,
                                       y=decimalLatitude))
## Joining with `by = join_by(SA1_CODE_2021)`

CRS

An annoying thing you will often encounter is coordinate references systems (CRS). Because the world is a sphere, when you make a map you need to deform that sphere in some way (to make it flat) which is described by the CRS. Both the objects we imported tell you their CRS, I think epsg.io is helpful here, like epsg.io/3577

You see the problem if you try:

ggplot()+geom_spatraster_rgb(data = landCover)+
  #coord_sf(crs=st_crs(7844))+ #7844 is GDA2020
  geom_point(data = koalas,
             aes(x=decimalLongitude,y=decimalLatitude),
             col='red')
## SpatRaster resampled to ncells = 501264

Because of the CRS this is completely wacky – fortunately the coord_sf() thing lets us fix this easily.

ggplot()+geom_spatraster_rgb(data = landCover)+
  coord_sf(crs=st_crs(7844))+ #7844 is GDA2020
  geom_point(data = koalas,
             aes(x=decimalLongitude,y=decimalLatitude),
             col='red')
## SpatRaster resampled to ncells = 501264

So, try to be mindful of what is happening when you import map objects, and if something weird happens it is probably the CRS.

some computations

Lets try to use these data to fit a model for the number of koalas in an SA1. To do this we need to count the koalas in each SA1, and also take the average of the landcover in each region.

#we need to convert the koala data to a spatial 
#object, then we can use st_intersects()
koalas_st <- st_as_sf(koalas,coords = c('decimalLongitude',
                                        'decimalLatitude'),
                      crs = st_crs(abs_poly))

abs_poly$koalas <- lengths(st_intersects(abs_poly,koalas_st))
#for the landcover, we need to project it so
#it is in the same CRS as the polygons, then
#use extract to take the mean
landCoverProj <- project(landCover, "EPSG:7844")
## 
|---------|---------|---------|---------|
=========================================
                                          
#this will take a while
ex1 <- extract(landCoverProj, abs_poly,mean)

abs_poly$Red <- ex1$Red
abs_poly$Green <- ex1$Green
abs_poly$Blue <- ex1$Blue
armidale <- abs_poly %>% 
  filter(SA2_NAME21 == 'Armidale') %>%
  mutate(SA1_CODE_2021 = as.numeric(SA1_CODE21)) %>%
  left_join(abs_data)
## Joining with `by = join_by(SA1_CODE_2021)`

An extremely naive model would be a Poisson GLM with these linear predictors, and an offset by the log of the area of the regions to take into account the fact that they are of different size. DHARMa lets us consider our modelling assumptions.

glm1 <- glm(koalas ~ Red + Green+ Blue + PhDs+ pop +
              offset(log(AREASQKM21)), data = armidale,
            family = 'poisson')
plot(simulateResiduals(glm1))

This is clearly not a particularly good model (it would take some thought here to decide how to deal with the populations given the areas of the regions etc.), but it is a useful example having done our map computations.